DETAILED ACTION 

1. Claims 1-20, 23-26, and 28-33 are pending. 

2. Preliminary amendments to the claims filed on 7/29/2003 have been entered for 
examination. 



Claim Objections 

3. The numbering of claims is not in accordance with 37 CFR 1.126 which requires 
the original numbering of the claims to be preserved throughout the prosecution. When 
claims are canceled, the remaining claims must not be renumbered. When new claims 
are presented, they must be numbered consecutively beginning with the number next 
following the highest numbered claims previously presented (whether entered or not). 

Claim 27 is not present in the amended claims and has been skipped by the 
applicant. Misnumbered claims 28-33 have been renumbered as claims 27-32, respectively. 
Correction is required. 



Double Patenting 

4. The nonstatutory double patenting rejection is based on a judicially created 
doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the 
unjustified or improper timewise extension of the "right to exclude" granted by a patent 
and to prevent possible harassment by multiple assignees. A nonstatutory 
obviousness-type double patenting rejection is appropriate where the conflicting claims 
are not identical, but at least one examined application claim is not patentably distinct 



Application/Control Number: 10/629,133 Page 3 

Art Unit: 2168 

from the reference claim(s) because the examined application claim is either anticipated 
by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 
F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 
USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 
1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 
F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 
644 (CCPA 1969). 

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) 
may be used to overcome an actual or provisional rejection based on a nonstatutory 
double patenting ground provided the conflicting application or patent either is shown to 
be commonly owned with this application, or claims an invention made as a result of 
activities undertaken within the scope of a joint research agreement. 

Effective January 1, 1994, a registered attorney or agent of record may sign a 
terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply 
with 37 CFR 3.73(b). 



5. Claims 1-20 and 23-32 are rejected on the ground of nonstatutory double 
patenting over claims 1-16 of U.S. Patent No. 6,519,557 B1 since the claims, if allowed, 
would improperly extend the "right to exclude" already granted in the patent. 

Claims 1, 23, and 28 of the instant application are broader than claim 1 of US 
Patent No. 6,519,557 B1. US Patent No. 6,519,557 B1 discloses a system and method 
for comparing two documents based on the hierarchical data structure of the two documents. 

The subject matter claimed in the instant application is fully disclosed in the 
patent and is covered by the patent since the patent and the application are claiming 
common subject matter. 

Furthermore, there is no apparent reason why applicant was prevented from 
presenting claims corresponding to those of the instant application during prosecution of 
the application which matured into a patent. See In re Schneller, 397 F.2d 350, 158 



USPQ 210 (CCPA 1968). See also MPEP § 804. 

Claim Rejections - 35 USC § 101 

6. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

7. Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention 
lacks patentable utility. The claimed invention does not produce a 
useful, concrete, and tangible result. Claim 1 describes a method for determining a 
degree of similarity between documents, but nowhere in dependent claims 
2-20 is the result of the method used in a meaningful way. The dependent claims also do not 
appear to add the step(s) necessary to achieve a practical application of the result; 
they instead store or generate additional information without using the result (the 
calculated measure of similarity) in a practical application. Correction is required. 

Claims 23-27 are rejected under 35 U.S.C. 101 because the claimed invention 
lacks patentable utility. The claimed invention does not produce a 
useful, concrete, and tangible result. Claim 23 describes a program storage device for 
performing a method for determining a degree of similarity between documents, but 
nowhere in dependent claims 24-27 is the result of the method used in a 
meaningful way. The dependent claims also do not appear to add the step(s) necessary to 
achieve a practical application of the result; they instead store or generate 
additional information without using the result (the calculated measure of similarity) in a 
practical application. Correction is required. 

Claim Rejections - 35 USC §112 

8. The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

9. Claims 28-32 are rejected under 35 U.S.C. 112, second paragraph, as being 
indefinite for failing to particularly point out and distinctly claim the subject matter which 
applicant regards as the invention. The metes and bounds of the claims are unclear, as 
the independent claim refers to a computer system while the dependent claims refer to 
method steps. Correction is required. 

Claims 29-31 recite the limitation "said method" in line 1 of each of claims 29, 30, and 31. 
There is insufficient antecedent basis for this limitation in the claims. Independent claim 
28 refers to a computer system, with no prior mention of a method in the claim. 
Correction is required. 

Claims 2, 24, and 28 recite the limitation "a Document Model Object 
representation" in lines 1-2 of each claim. There is insufficient antecedent basis for this 
limitation in the claims. This term is never used in the specification. For the purpose of 
examination, the examiner replaces the term with "a Document Object Model 
representation", which is defined in the specification and by pertinent references. 



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

10. Claims 1-12, 14-20, and 23-32 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Hattori et al. ("Hattori" US# 6,889,223 B2) in view of Cruz et al. 
("Measuring Structural Similarity Among Web Documents: Preliminary Results"; Cruz et 
al. 1998; Lecture Notes in Computer Science, volume 1375, page 513). 

As per claim 1, Hattori discloses "A method for determining a degree of similarity 
between documents," (see Abstract) "storing, for at least two documents, labeled tree 
representations of respective documents;" (Figures 4-8 and column 8 lines 29-41, 
wherein the document storage is a structured document database that stores labeled, 
hierarchical tree structures of documents) "storing, for at least two documents, path 
representations relating to paths that occur in the documents from root nodes to leaf 
nodes in the labeled tree representations of the respective documents;" (Figure 9 and 
column 9 lines 24-36, wherein an index storage contains a structure index that has path 
representation information relating to paths that occur in documents according to a tree 
representation) "and calculating a measure of similarity" (column 34 line 63 - column 35 
line 37, wherein similarity values are calculated for documents based on nodes). Hattori 
does not expressly disclose "calculating a measure of similarity between two of the 
documents based upon the frequency of occurrence of similar paths specified by the 
path representations." 

Cruz expressly discloses "calculating a measure of similarity between two of the 
documents based upon the frequency of occurrence of similar paths specified by the 
path representations." (Page 9 paragraphs 3-5, wherein the similarity between 
two documents is measured through the differences between the two documents, as well 
as their similarities, which is analogous). 

It would have been obvious at the time of the invention for one of ordinary skill in 
the art to combine Hattori's method of comparing documents based on their structure 
with Cruz's method of calculating the similarity between two documents. This gives the 
user an additional source of information when organizing and querying databases 
holding documents. The motivation for doing so would be to classify documents based 
on their structural similarity and to measure the similarity between them. 

As per claim 2, Hattori discloses "the tree representation is a Document Model 
Object representation." (column 7 lines 62-64, wherein the tree structure of the 
document is a document object tree model). 

As per claim 3, Hattori discloses "the step of generating a path representation for 
a path of a document as a sequence of labels representative from a root node to a leaf 
node in the labeled tree representation of the document." (column 8 line 66 - column 9 
line 16, wherein a sequence of labels representative of the tree representation is 
stored in a structure index using a structure document pass step). 
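
For illustration only, the quoted limitation of claim 3 (a path represented as the sequence of labels from a root node to a leaf node) can be sketched as follows. The sketch assumes the document is well-formed XML and uses element tag names as labels; the function name `root_to_leaf_paths` and the "/" separator are hypothetical choices and are not taken from Hattori or from the application.

```python
import xml.etree.ElementTree as ET


def root_to_leaf_paths(xml_text, sep="/"):
    """Return one label path per leaf node of the document's element tree.

    Assumption: element tag names serve as the node labels.
    """
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        labels = prefix + [node.tag]
        children = list(node)
        if not children:
            # Leaf node: emit the label sequence accumulated from the root.
            paths.append(sep.join(labels))
        for child in children:
            walk(child, labels)

    walk(root, [])
    return paths


doc = "<html><body><p/><p/><div><p/></div></body></html>"
paths = root_to_leaf_paths(doc)
print(paths)
```

A path that occurs more than once in a document appears more than once in the returned list, so its frequency of occurrence can be counted afterwards.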

As per claim 4, Hattori discloses "the step of storing, as path representations, 
sets of sequenced labels representative of distinct paths in a labeled tree representation 
of a corresponding document." (column 9 lines 17-36, wherein structure index 
information is stored in an index storage that represents distinct paths in a labeled tree 
representation of a document). 

As per claim 5, Hattori discloses "the step of storing a path dictionary 
(Dict.sub.paths=[p.sub.1, p.sub.2, . . . , p.sub.N]) of distinct paths collated from a tree 
representation for a document." (column 12 lines 29-45, wherein a schema is made that 
shows the distinct paths of a document from a tree representation and is analogous). 

As per claim 6, Hattori discloses "the step of eliminating selected paths from the 
path dictionary (Dict.sub.paths)." (column 29 lines 27-35, wherein a structure evolving 
device generates a bind table that eliminates selected paths). 

As per claim 7, Hattori discloses "paths that occur highly frequently or highly 
infrequently are eliminated from the path dictionary (Dict.sub.paths)." (column 29 lines 
40-56, wherein an upper location and lower location evolving device is used by the 
structure evolving device to choose paths to eliminate to make a bind table showing 
paths and similarity). 
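
For illustration only, the limitations quoted for claims 5-7 (collating a dictionary of distinct paths and eliminating paths that occur highly frequently or highly infrequently) can be sketched as follows. The claims as quoted do not state thresholds or how frequency is counted; the sketch assumes frequency is counted per document, and the names `build_path_dictionary`, `min_df`, and `max_df_ratio` and their default values are hypothetical.

```python
from collections import Counter


def build_path_dictionary(docs_paths, min_df=2, max_df_ratio=0.9):
    """Collate distinct paths and drop very rare and very common ones.

    docs_paths: one list of path strings per document.
    min_df and max_df_ratio are assumed, illustrative thresholds.
    """
    n_docs = len(docs_paths)
    doc_freq = Counter()
    for paths in docs_paths:
        # Count each distinct path once per document.
        doc_freq.update(set(paths))
    return sorted(
        path
        for path, count in doc_freq.items()
        if count >= min_df and count / n_docs <= max_df_ratio
    )


docs = [
    ["a/b", "a/c"],
    ["a/b", "a/d"],
    ["a/b", "a/c"],
    ["a/b", "a/c"],
]
kept = build_path_dictionary(docs)
print(kept)
```

Here "a/b" occurs in every document and "a/d" in only one, so both fall outside the assumed thresholds and only "a/c" remains in the dictionary.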

As per claim 8, Hattori discloses "the step of computing the frequency of 
occurrence (f.sub.j(p.sub.i)) of a path (p.sub.i) in a document (d.sub.j)." (column 33 lines 
33-55, wherein path frequency is found based on tree level and the bind table). 

As per claim 9, Hattori discloses "the step of computing the maximum number of 
instances (f.sub.max=max.sub.ij f.sub.j(p.sub.i)) in which a path (p.sub.i) in the 
document (d.sub.j) occurs." (column 34 lines 1-19, wherein a final bind table aggregates 
the number of times a path occurs in the document through a structure evolving device). 

As per claim 10, Hattori discloses "the step of storing a representation of the 
document (d.sub.j) as a N-dimensional vector ([d.sub.j1, d.sub.j2, . . . , d.sub.jN], where 
d.sub.jk=f.sub.j(p.sub.k)/f.sub.max, 1 ≤ k ≤ N) of relative frequencies of occurrence 
(f.sub.j(p.sub.k)) of paths (p.sub.k) in the document (d.sub.j)." (Figure 9 and column 9 
lines 26-36, wherein the structure index stores information in a vector showing node 
paths traveled and is analogous). 
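
For illustration only, the quantities quoted for claims 8-10 (the frequency f.sub.j(p.sub.i) of each path in each document, the maximum f.sub.max over all paths and documents, and the vector of relative frequencies d.sub.jk=f.sub.j(p.sub.k)/f.sub.max) can be sketched as follows. The function name `frequency_vectors` and the choice to order the dictionary alphabetically are assumptions made for the example.

```python
from collections import Counter


def frequency_vectors(docs_paths):
    """Return (dictionary, vectors) of relative path frequencies.

    docs_paths: one list of path strings per document.
    Each vector entry is f_j(p_k) / f_max, with f_max taken over
    every path in every document.
    """
    dictionary = sorted({p for paths in docs_paths for p in paths})
    counts = [Counter(paths) for paths in docs_paths]
    # default=1 avoids an error (and a zero divisor) when there are no paths.
    f_max = max((c[p] for c in counts for p in dictionary), default=1)
    vectors = [[c[p] / f_max for p in dictionary] for c in counts]
    return dictionary, vectors


docs = [["a/b", "a/b", "a/c"], ["a/c", "a/d"]]
dictionary, vectors = frequency_vectors(docs)
print(dictionary)
print(vectors)
```

In this example the most frequent path occurs twice, so f.sub.max is 2 and every vector entry lies between 0 and 1.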

As per claim 11, Hattori discloses "the step of computing the minimum number of 
instances (f.sub.min=min.sub.ij f.sub.j(p.sub.i)) in which a path (p.sub.i) in the document 
(d.sub.j) occurs." (column 34 lines 1-19, wherein a final bind table is made using lower 
location evolving and is synonymous). 

As per claim 12, Hattori discloses "the step of computing the similarity between a 
pair of documents (d.sub.i, d.sub.l) as a function (sim(d.sub.i, d.sub.l)) of metrics 
relating the number of paths common to the respective documents (d.sub.i, d.sub.l)." 
(column 38 line 47 - column 39 line 11, wherein a similarity value is computed using 
path information from the structure of documents). 

As per claim 14, Hattori discloses "the tree representation of a document 
includes a positional index, which represents, for a node (n), the number of previous 
sibling nodes with the same label as that of node (n)." (column 8 lines 42-57, wherein 
the tree representation includes an object ID including linking information). 
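
For illustration only, the positional index quoted for claim 14 (for a node n, the number of previous sibling nodes with the same label as n) can be sketched as follows. The sketch assumes an XML element tree with tag names as labels; the function name `positional_indices` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def positional_indices(parent):
    """Return (label, index) for each child of parent, in document order.

    index is the number of earlier siblings that carry the same label.
    """
    seen = Counter()
    result = []
    for child in parent:
        result.append((child.tag, seen[child.tag]))
        seen[child.tag] += 1
    return result


body = ET.fromstring("<body><p/><div/><p/><p/></body>")
indexed = positional_indices(body)
print(indexed)
```

The first "p" and the only "div" each receive index 0, while the second and third "p" elements receive indices 1 and 2.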

As per claim 15, Hattori discloses "the step of storing as a path representation a 
set that defines positional information of sibling nodes under a parent node." (column 8 
lines 58-65, wherein within the structure index is positional information of sibling nodes 
under parent nodes). 

As per claim 16, Hattori discloses "the step of storing precise path 
representations that precisely define a document structure, and generalised path 
representations that partially generalise structural aspects of precise path 
representations of a document." (Figure 23 and column 4 lines 1-3, wherein the schema 
that forms the bind table precisely defines the document structure and path 
representations). 

As per claim 17, Hattori discloses "the step of calculating the measure of 
similarity involves determining a total number of precise path representations of one 
document that are either shared by the other document, or are a subsumed subset of at 
least one of the generalised path representations of the other document." (column 38 
lines 29-43, wherein the measure of similarity needs the path representation of 
documents to be calculated). 

As per claim 18, Hattori discloses "the step of normalising the measure of 
similarity by a term that represents the number of unique path representations shared 
by the two documents." (column 38 lines 29-34, wherein the measure of similarity is 
found using the number of levels to have a standard for documents and a similarity 
value is found). 

As per claim 19, Hattori discloses "the number of unique path representations is 
calculated by adding the number of path representations for each document, and 
subtracting from this total the number path representations shared by the two 
documents." (column 39 lines 29-55, wherein the structure components of a document 
are used to aggregate the number of paths and is compared to a target document's 
structure components). 
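
For illustration only, the arithmetic quoted for claims 18 and 19 (the number of unique path representations obtained by adding the number of path representations for each document and subtracting the number shared, and the use of that number to normalise the similarity) can be sketched as follows. This is a simplified reading that treats each document as a set of distinct paths and ignores the generalised path representations of claims 16 and 17; the function names are hypothetical.

```python
def unique_path_count(paths_a, paths_b):
    """Return (unique, shared) for two collections of path strings.

    unique = |A| + |B| - |A and B|, as in the quoted claim 19 language.
    """
    a, b = set(paths_a), set(paths_b)
    shared = len(a & b)
    unique = len(a) + len(b) - shared
    return unique, shared


def normalised_similarity(paths_a, paths_b):
    """Shared path count divided by the unique path count (0.0 if no paths)."""
    unique, shared = unique_path_count(paths_a, paths_b)
    return shared / unique if unique else 0.0


doc_a = ["x/a", "x/b", "x/c"]
doc_b = ["x/b", "x/c", "x/d"]
print(unique_path_count(doc_a, doc_b))
print(normalised_similarity(doc_a, doc_b))
```

Under this reading the two example documents share 2 of 4 unique paths, giving a normalised similarity of 0.5.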

As per claim 20, Hattori discloses "the step of storing as a path representation a 
sequence of terms separated by a delimiting symbol, in which each term is represented 
by a label and a parenthesised predicate that specifies the positional index of the term 
either specifically or generally." (column 18 lines 30-43 and column 24 lines 45-60, 
wherein the path representation is shown from the structure of the document and is 
stored, including level ID, to show a positional index). 
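
For illustration only, the path notation quoted for claim 20 (terms separated by a delimiting symbol, each term being a label with a parenthesised predicate giving the positional index either specifically or generally) can be sketched as follows. The "/" delimiter and the use of "*" to denote a generalised index are assumptions made for the example; the quoted claim language does not fix either symbol.

```python
def format_path(terms, sep="/"):
    """Render (label, index) terms as a delimited path string.

    index is an int for a specific position, or None for a generalised
    position, written here with the assumed wildcard "*".
    """
    return sep.join(
        f"{label}({'*' if index is None else index})" for label, index in terms
    )


precise = format_path([("html", 0), ("body", 0), ("p", 2)])
general = format_path([("html", 0), ("body", 0), ("p", None)])
print(precise)
print(general)
```

The first string names one specific "p" element, while the second stands for a "p" element at any position under the same parent.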

As per claim 23, Hattori discloses "A program storage device readable by 
computer, tangibly embodying a program of instructions executable by said computer to 
perform a method for determining a degree of similarity between documents," (see 
Abstract and column 40 lines 30-35) "storing, for at least two documents, labeled tree 
representations of respective documents;" (Figures 4-8 and column 8 lines 29-41, 
wherein the document storage is a structured document database that stores labeled, 
hierarchical tree structures of documents) "storing, for at least two documents, path 
representations relating to paths that occur in the documents from root nodes to leaf 
nodes in the labeled tree representations of the respective documents;" (Figure 9 and 
column 9 lines 24-36, wherein an index storage contains a structure index that has path 
representation information relating to paths that occur in documents according to a tree 
representation) "and calculating a measure of similarity" (column 34 line 63 - column 35 
line 37, wherein similarity values are calculated for documents based on nodes). Hattori 
does not expressly disclose "calculating a measure of similarity between two of the 
documents based upon the frequency of occurrence of similar paths specified by the 
path representations." 

Cruz expressly discloses "calculating a measure of similarity between two of the 
documents based upon the frequency of occurrence of similar paths specified by the 
path representations." (Page 9 paragraphs 3-5, wherein the similarity between 
two documents is measured through the differences between the two documents, as well 
as their similarities, which is analogous). 

It would have been obvious at the time of the invention for one of ordinary skill in 
the art to combine Hattori's method of comparing documents based on their structure 
with Cruz's method of calculating the similarity between two documents. This gives the 
user an additional source of information when organizing and querying databases 
holding documents. The motivation for doing so would be to classify documents based 
on their structural similarity and to measure the similarity between them. 

As per claim 24, Hattori discloses "the tree representation is a Document Model 
Object representation." (column 7 lines 62-64, wherein the tree structure of the 
document is a document object tree model). 

As per claim 25, Hattori discloses "the step of generating a path representation 
for a path of a document as a sequence of labels representative from a root node to a 
leaf node in the labeled tree representation of the document." (column 8 line 66 - 
column 9 line 16, wherein a sequence of labels representative of the tree representation 
is stored in a structure index using a structure document pass step). 

As per claim 26, Hattori discloses "the step of storing, as path representations, 
sets of sequenced labels representative of distinct paths in a labeled tree representation 
of a corresponding document." (column 9 lines 17-36, wherein structure index 
information is stored in an index storage that represents distinct paths in a labeled tree 
representation of a document). 

As per claim 27, Hattori discloses "the tree representation of a document 
includes a positional index, which represents, for a node (n), the number of previous 
sibling nodes with the same label as that of node (n)." (column 8 lines 42-57, wherein 
the tree representation includes an object ID including linking information). 

As per claim 28, Hattori discloses "A computer system operable for determining a 
degree of similarity between documents," (see Abstract) "storing, for at least two 
documents, labeled tree representations of respective documents;" (Figures 4-8 and 
column 8 lines 29-41, wherein the document storage is a structured document database 
that stores labeled, hierarchical tree structures of documents) "storing, for at least two 
documents, path representations relating to paths that occur in the documents from root 
nodes to leaf nodes in the labeled tree representations of the respective documents;" 
(Figure 9 and column 9 lines 24-36, wherein an index storage contains a structure index 
that has path representation information relating to paths that occur in documents 
according to a tree representation) "and calculating a measure of similarity" (column 34 
line 63 - column 35 line 37, wherein similarity values are calculated for documents 
based on nodes). Hattori does not expressly disclose "calculating a measure of 
similarity between two of the documents based upon the frequency of occurrence of 
similar paths specified by the path representations." 

Cruz expressly discloses "calculating a measure of similarity between two of the 
documents based upon the frequency of occurrence of similar paths specified by the 
path representations." (Page 9 paragraphs 3-5, wherein the similarity between 
two documents is measured through the differences between the two documents, as well 
as their similarities, which is analogous). 

It would have been obvious at the time of the invention for one of ordinary skill in 
the art to combine Hattori's method of comparing documents based on their structure 
with Cruz's method of calculating the similarity between two documents. This gives the 
user an additional source of information when organizing and querying databases 
holding documents. The motivation for doing so would be to classify documents based 
on their structural similarity and to measure the similarity between them. 

As per claim 29, Hattori discloses "the tree representation is a Document Model 
Object representation." (column 7 lines 62-64, wherein the tree structure of the 
document is a document object tree model). 

As per claim 30, Hattori discloses "the step of generating a path representation 
for a path of a document as a sequence of labels representative from a root node to a 
leaf node in the labeled tree representation of the document." (column 8 line 66 - 
column 9 line 16, wherein a sequence of labels representative of the tree representation 
is stored in a structure index using a structure document pass step). 

As per claim 31, Hattori discloses "the step of storing, as path representations, 
sets of sequenced labels representative of distinct paths in a labeled tree representation 
of a corresponding document." (column 9 lines 17-36, wherein structure index 
information is stored in an index storage that represents distinct paths in a labeled tree 
representation of a document). 

As per claim 32, Hattori discloses "the tree representation of a document 
includes a positional index, which represents, for a node (n), the number of previous 
sibling nodes with the same label as that of node (n)." (column 8 lines 42-57, wherein 
the tree representation includes an object ID including linking information). 

11. Claim 13 is rejected under 35 U.S.C. 103(a) as being unpatentable over Hattori 
et al. ("Hattori" US# 6,889,223 B2), in view of Cruz et al. ("Measuring Structural 
Similarity Among Web Documents: Preliminary Results"; Cruz et al. 1998; Lecture 
Notes in Computer Science, volume 1375, page 513), and further in view of Schuetze et 
al. ("Schuetze" US# 6,598,054 B2). 

As per claim 13, Hattori and Cruz do not expressly disclose "the function for 
computing the similarity between a pair of documents (di, dl) ... is the quotient of a 
numerator, defined as the sum for all paths (k=1 ... N) of the minimum number of 
instances (min(dik, dlk)) in which paths occur in the respective documents (di, dl), and a 
denominator, defined as the sum for all paths (k=1 ... N) of the maximum number of 
instances (max(dik, dlk)) in which paths occur in the respective documents (di, dl)." 

Schuetze discloses "the function for computing the similarity between a pair of 
documents (di, dl) ... is the quotient of a numerator, defined as the sum for all paths 
(k=1 ... N) of the minimum number of instances (min(dik, dlk)) in which paths occur in 
the respective documents (di, dl), and a denominator, defined as the sum for all paths 
(k=1 ... N) of the maximum number of instances (max(dik, dlk)) in which paths occur in 
the respective documents (di, dl)." (column 13 lines 13-28 and column 16 lines 1-5, 
wherein the exact formula is used to show similarities between two documents, with d1 
and d2 being the two documents being compared). 
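
For illustration only, the quotient quoted for claim 13 can be sketched as follows, reading the denominator term as max(dik, dlk), consistent with the words "maximum number of instances". The function name `min_max_similarity` is hypothetical, and the inputs are assumed to be two equal-length vectors of non-negative path frequencies indexed by the same path dictionary.

```python
def min_max_similarity(d_i, d_l):
    """Sum of element-wise minima divided by sum of element-wise maxima.

    d_i, d_l: equal-length sequences of non-negative path frequencies.
    Returns 0.0 when both vectors are all zeros (an assumed convention).
    """
    if len(d_i) != len(d_l):
        raise ValueError("vectors must cover the same path dictionary")
    numerator = sum(min(a, b) for a, b in zip(d_i, d_l))
    denominator = sum(max(a, b) for a, b in zip(d_i, d_l))
    return numerator / denominator if denominator else 0.0


d_i = [1.0, 0.5, 0.0]
d_l = [0.0, 0.5, 0.5]
print(min_max_similarity(d_i, d_l))
```

For these vectors the minima sum to 0.5 and the maxima sum to 2.0, so the quotient is 0.25; identical non-zero vectors give 1.0 and vectors with no path in common give 0.0.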

It would have been obvious at the time of the invention for one of ordinary skill in 
the art to combine Hattori's method of comparing documents based on their structure 
and Cruz's method of calculating the similarity between two documents with Schuetze's 
method of comparing two documents using a specific formula to find an exact measure 
of similarity. This gives the user a more precise and efficient way to measure similarities 
between documents. The motivation for doing so would be to decrease the difficulty in 
navigating document collections and to be able to perform searches based on the 
characteristics of the documents. 



Conclusion 

12. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Sanfilippo (US Pub 2003/0028564 A1) 

Nakayama et al. (US # 6,622,139 B1) 

Wheeler et al. (US # 6,738,759 B1) 

Gansky et al. (US Pub 2004/0205454 A1) 

Bernstein et al. (US # 6,826,568 B2) 

Su et al. (US # 6,845,380 B2) 

Igata (US # 6,853,992 B2) 

Aiken (US # 6,658,626 B1) 

Manber et al. (US # 6,920,609 B1) 

Bharat et al. (US # 6,487,555 B1) 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Dangelino N. Gortayo whose telephone number is 
(571)272-7204. The examiner can normally be reached on M-F 7:30-4:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Jeffrey A. Gaffin, can be reached on (571)272-4146. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 